7x1-PT: um Corpus extraído do Twitter para Análise de Sentimentos em Língua Portuguesa (7x1-PT: a Corpus extracted from Twitter for Sentiment Analysis in Portuguese Language)
Abstract
This paper describes the 7x1-PT corpus, a set of tweets in Portuguese posted during the Germany vs. Brazil match at the 2014 FIFA World Cup. We describe the collection, cleaning and organization of the data, as well as the current stage of the corpus's linguistic annotation.
Similar resources
RePort - Um Sistema de Extração de Informações Aberta para Língua Portuguesa (Report - An Open Information Extraction System for Portuguese Language)
An emerging field of research in Natural Language Processing (NLP) proposes Open Information Extraction (Open IE) systems. Open IE systems follow a domain-independent extraction paradigm that uses generic patterns to extract all relationships between entities. In this work, we present RePort, an Open IE method for Portuguese based on ReVerb, an approach for English. Adaptations of syntactic and...
A study on irony within the context of 7x1-PT corpus
The increasing use of social networks to express consumer opinions yields a large amount of potentially useful information that organizations can use to gauge consumer perception of their products. Nevertheless, assigning polarities to opinionated text is not a trivial task, especially when dealing with short and ironic text. In this paper, we evaluate the presence of irony at t...
Mineração de emoções em textos multilíngues usando um corpus paralelo (Emotion mining in multilingual texts using a parallel corpus)
Multilingual Opinion Mining deals with the analysis of opinions, regardless of the language in which they are written. Works in this area focus on the classification of the polarity of opinions extracted from texts, and less attention has been paid to the classification of emotions. This work proposes the use of Multilingual Opinion Mining techniques for emotion mining using parallel corpora. W...
Análise Adaptativa de Fluxo de Sentimento Baseada em Janela Deslizante Ativa (Adaptive Sentiment Stream Analysis Based on an Active Sliding Window)
In recent years, the task of sentiment analysis has attracted much interest from the machine learning community. Considering the benefits offered by this analysis, it is increasingly necessary to analyze feelings and opinions that are expressed continuously in sentiment streams provided by users in social media channels. Many automatic classification techniques have been used to perform this se...
Identificação de Autoria de Textos através do uso de Classes Linguísticas da Língua Portuguesa (Authorship Identification Using Linguistic Classes for Portuguese) [in Portuguese]
The use of computational solutions to solve problems related to authorship identification and verification has grown progressively in areas such as computing, linguistics and law. This article aims to provide a method for identifying the authors of texts, based on a set of stylometric attributes and drawing on the characteristics of the Portuguese language. Resumo. A utilização do meio computa...